IN THE CLAIMS: 

The text of all pending claims (including withdrawn claims) is set forth below. Cancelled
and not entered claims are indicated with claim number and status only. The claims as listed
below show added text with underlining and deleted text with strikethrough. The status of each
claim is indicated with one of (original), (currently amended), (cancelled), (withdrawn), (new), 
(previously presented), or (not entered). 

Please AMEND claims in accordance with the following: 

1. (CURRENTLY AMENDED) An apparatus for extracting information from a
formatted document, comprising: an input unit for inputting a formatted document; a unit
for analyzing the input formatted document and saving the particular typographic information; a
unit (3) for identifying special character strings on the basis of the analysis result by means of
the typographic information such as font size, character font, color, etc.; a unit for extracting
the identified special character strings; and an output unit for outputting the
extracted character strings.

2. (CURRENTLY AMENDED) The apparatus for extracting information from a 
formatted document according to claim 1, wherein said unit (3) for identifying special character
strings determines a certain character string as a special one on the basis of the typographic 
information of said formatted document when the typographic information of said character 
string is determined as a special typographic information. 

3. (CURRENTLY AMENDED) The apparatus for extracting information from a 
formatted document according to claim 1, wherein said formatted document is an HTML
document, and said unit (3) for identifying special character strings determines a certain character string as
a special one on the basis of the analyzing results with respect to said HTML document when 
the font size of said character string is determined to be the biggest one among the surrounding 
character strings. 

4. (CURRENTLY AMENDED) The apparatus for extracting information from a 
formatted document according to claim 1, wherein said formatted document is an HTML
document, and said unit (3) for identifying special character strings determines a certain
character string as a special one on the basis of the analyzing results with respect to said
HTML document when the color and the font of said character string is determined to be a 
special one among the surrounding character strings. 

5. (CURRENTLY AMENDED) The apparatus for extracting information from a 
formatted document according to claim 1, wherein said formatted document is an HTML
document, and said unit (3) for identifying special character strings determines a certain 
character string as a special one on the basis of the analyzing results with respect to said 
HTML document when the font of said character string is determined to be different from the 
surrounding character strings and said character string to be boldface.

6. (CURRENTLY AMENDED) The apparatus for extracting information from a 
formatted document according to claim 1, wherein said formatted document is an HTML
document, and said unit (3) for identifying special character strings determines a certain 
character string as a special one on the basis of the analyzing results with respect to
said HTML document when the color of said character string is determined to be different from 
the surrounding character strings and said character string to be boldface. 

7. (CURRENTLY AMENDED) A method for extracting information from a formatted 
document, comprising the following steps: inputting a formatted document; analyzing the input
formatted document and saving the particular typographic information; identifying special 
character strings on the basis of the analysis result by means of the typographic information 
such as font size, character font, color, etc.; extracting the identified special character
strings; and outputting the extracted character strings. 

8. (CURRENTLY AMENDED) The method according to claim 7, wherein in the step
of identifying special character string, a certain character string is determined as a
special one on the basis of the typographic information of said formatted document when the 
typographic information of said character string is determined as a special typographic 
information. 

9. (CURRENTLY AMENDED) The method according to claim 7, wherein said
formatted document is HTML document, and in the step of identifying special character string, a 
certain character string is determined as a special one on the basis of the analyzing results with
respect to said HTML document when the font size of said character string is determined to be
the biggest one among the surrounding character strings. 

10. (CURRENTLY AMENDED) The method according to claim 7, wherein said
formatted document is HTML document, and in the step of identifying special character string, a 
certain character string is determined as a special one on the basis of the analyzing
results with respect to said HTML document when the color and the font of said character string 
is determined to be a special one among the surrounding character strings. 

11. (CURRENTLY AMENDED) The method according to claim 7, wherein said
formatted document is HTML document, and in the step of identifying special character string, a 
certain character string is determined as a special one on the basis of the analyzing results with 
respect to said HTML document when the font of said character string is determined to be 
different from the surrounding character strings and said character string to be boldface. 

12. (CURRENTLY AMENDED) The method according to claim 7, wherein said
formatted document is HTML document, and in the step of identifying special character string, a 
certain character string is determined as a special one on the basis of the analyzing results with 
respect to said HTML document when the color of said character string is determined to be 
different from the surrounding character strings and said character string to be boldface. 
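
For illustration only, the steps recited in claims 7 to 12 might be sketched as follows for an HTML document. This is a minimal sketch and not the claimed implementation. The names RunCollector, extract_special and DEFAULT are hypothetical, the default style values are assumed, "surrounding character strings" is assumed to mean the immediately preceding and following text runs, and only font, b and strong tags are considered.

```python
from html.parser import HTMLParser

# Assumed default typography for text outside any <font>/<b> tag.
DEFAULT = {"size": 3, "color": "black", "face": "serif", "bold": False}


class RunCollector(HTMLParser):
    """Analysis step: save each text run with the typographic state in effect."""

    def __init__(self):
        super().__init__()
        self.stack = [dict(DEFAULT)]  # style stack; the top is the current style
        self.runs = []                # list of (text, style) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in ("font", "b", "strong"):
            return
        style = dict(self.stack[-1])  # inherit the enclosing style
        if tag == "font":
            for name, value in attrs:
                if name == "size" and value and value.isdigit():
                    style["size"] = int(value)
                elif name in ("color", "face") and value:
                    style[name] = value.lower()
        else:
            style["bold"] = True
        self.stack.append(style)

    def handle_endtag(self, tag):
        if tag in ("font", "b", "strong") and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((text, self.stack[-1]))


def extract_special(html):
    """Identification and extraction steps: return the special character strings."""
    parser = RunCollector()
    parser.feed(html)
    parser.close()
    runs = parser.runs
    special = []
    for i, (text, style) in enumerate(runs):
        # Assumption: the surrounding strings are the adjacent runs only.
        neighbours = [s for _, s in runs[max(0, i - 1):i] + runs[i + 1:i + 2]]
        if not neighbours:
            continue
        # Font size is the biggest among the surrounding strings.
        biggest = all(style["size"] > n["size"] for n in neighbours)
        # Boldface, and color or font differs from every surrounding string.
        bold_and_distinct = style["bold"] and all(
            style["color"] != n["color"] or style["face"] != n["face"]
            for n in neighbours
        )
        if biggest or bold_and_distinct:
            special.append(text)
    return special
```

Under these assumptions, calling extract_special on the fragment '<p>intro text <font size="5">Big Title</font> more text <b><font color="red">Warning</font></b> tail</p>' returns the two strings "Big Title" and "Warning".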



